Multiplicative Models in Projection Pursuit
Abstract
Friedman and Stuetzle (JASA, 1981) developed a methodology for modeling a response surface by the sum of general smooth functions of linear combinations of the predictor variables. Here multiplicative models for regression and categorical regression are explored. The construction of these models and their performance relative to additive models are examined.

CHAPTER 0 INTRODUCTION

In recent work, Jerome H. Friedman and Werner Stuetzle have developed a methodology of additive projection pursuit modelling. This dissertation examines the corresponding question for multiplicative modelling: how to accomplish it and when it is superior to additive modelling. Two general statistical problems are explored: categorical regression and classification, and regression.

Chapters 1 through 3 deal with categorical regression and classification. The first introduces the problem and briefly reviews the method of Friedman and Stuetzle. The second describes the multiplicative model and gives several examples of its application. Finally, Chapter 3 discusses four related topics: the generalization to multiple classes, use of a multiplicative model as an extension to discriminant analysis, the choice of minimization criterion, and the relative performance of the additive and multiplicative models. Chapter 4 discusses the building of multiplicative models in regression and gives examples of their use. The appendix explains the numerical optimization techniques used by these procedures.

Routines implementing all of these procedures have been written and have been integrated into the framework designed by Friedman, Stuetzle and Roger Chaffee for additive projection pursuit.

CHAPTER ONE CATEGORICAL REGRESSION AND PROJECTION PURSUIT

§1.1. The Categorical Regression and Classification Framework

The first situation to be considered here is that of categorical regression and classification. A training sample

$$(Y_1, x_1), (Y_2, x_2), \ldots, (Y_N, x_N) \tag{1.1}$$

is observed, where $x_n$ is a p-dimensional vector of predictor variables associated with the nth observation, and $Y_n$ is a discrete variable indicating to which of K mutually exclusive classes the observation belongs (labelled 1 through K for convenience). The sample could be completely random or stratified on x. If the marginal distribution of Y is known, it could instead be stratified on Y.

Categorical regression seeks to estimate the probability of the response Y falling into each class conditional on the value of x:

$$p_k(x) = \Pr\{Y = k \mid x\}, \qquad 1 \le k \le K. \tag{1.2}$$

For example, in a business application, class 1 might represent those loan applicants who would default if granted a loan, while class 2 denotes those who would repay it in full. The vector x might include income, job stability and other personal factors that could affect repayment. The function $p_1(x)$ would then indicate the probability of default given salary and other characteristics.

Many applications require a decision rule that will identify the response class Y based on the predictors x. In the example such a rule would divide loan applications into “good credit risks” and “bad”, hopefully protecting the bank from unwise loans and loss of money. Since no decision rule would be completely accurate, classification errors would result in various losses. Labelling a good risk as bad deprives the bank of a profitable loan opportunity. Identifying
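As a minimal sketch of the kind of decision rule discussed above, assuming estimates of the class probabilities $p_k(x)$ are already available, the following Python snippet assigns the class with the smallest expected loss. The function name `min_expected_loss_class` and the loss values are hypothetical illustrations, not taken from the source, and the source does not state that this particular rule is the one used.

```python
def min_expected_loss_class(probs, loss):
    """Return the 1-based class label with the smallest expected loss.

    probs: list of K estimated probabilities p_k(x), k = 1..K
           (assumed to be non-negative and to sum to 1).
    loss:  K x K nested list; loss[j][k] is the cost of deciding class k+1
           when the true class is j+1 (hypothetical values).
    """
    K = len(probs)
    # expected[k] = sum over true classes j of p_j(x) * loss[j][k]
    expected = [sum(probs[j] * loss[j][k] for j in range(K)) for k in range(K)]
    return min(range(K), key=expected.__getitem__) + 1


# Class 1 = applicant would default, class 2 = applicant would repay.
# Assumed costs: lending to a defaulter costs 5 units,
# refusing a good applicant costs 1 unit, correct decisions cost 0.
LOSS = [
    [0.0, 5.0],  # true class 1: decide 1 (refuse), decide 2 (lend)
    [1.0, 0.0],  # true class 2: decide 1 (refuse), decide 2 (lend)
]

decision = min_expected_loss_class([0.2, 0.8], LOSS)
print(decision)
```

With these assumed costs, an applicant with an estimated default probability of 0.2 is still labelled class 1 (expected loss 0.8 for refusing versus 1.0 for lending), which shows how unequal losses can move the decision boundary away from the most probable class.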
Similar resources
Performing a Preprocessing Step Before the Feature Extraction Step in the Classification of Hyperspectral Image Data
Hyperspectral data potentially contain more information than multispectral data because of their higher spectral resolution. However, the stochastic data analysis approaches that have been successfully applied to multispectral data are not as effective for hyperspectral data. Various investigations indicate that the key problem that causes poor performance in the stochastic approaches t...
The overall efficiency and projection point in network DEA
Data Envelopment Analysis (DEA) is one of the best methods for measuring the efficiency and productivity of Decision Making Units (DMUs). Evaluating the efficiency of DMUs that have two or more stages with conventional DEA models amounts to treating them as black boxes. This approach omits the effect of intermediate measures on efficiency. Therefore, just the first network inputs and t...
Nonlinear Principal Component Analysis, Manifolds and Projection Pursuit
Auto-associative models have been introduced as a new tool for building nonlinear Principal component analysis (PCA) methods. Such models rely on successive approximations of a dataset by manifolds of increasing dimensions. In this chapter, we propose a precise theoretical comparison between PCA and autoassociative models. We also highlight the links between auto-associative models, projection ...
Efficient Parametric Projection Pursuit Density Estimation
Product models of low dimensional experts are a powerful way to avoid the curse of dimensionality. We present the "undercomplete product of experts" (UPoE), where each expert models a one dimensional projection of the data. The UPoE may be interpreted as a parametric probabilistic model for projection pursuit. Its ML learning rules are identical to the approximate learning rules proposed ...
Interpretable Projection Pursuit*
The goal of this thesis is to modify projection pursuit by trading accuracy for interpretability. The modification produces a more parsimonious and understandable model without sacrificing the structure which projection pursuit seeks. The method retains the nonlinear versatility of projection pursuit while clarifying the results. Following an introduction which outlines the dissertation, the fi...